Discriminative rescoring based on minimization of word errors for transcribing broadcast news

Authors

  • Akio Kobayashi
  • Takahiro Oku
  • Shinichi Homma
  • Shoei Sato
  • Toru Imai
  • Tohru Takagi
Abstract

This paper describes a novel rescoring method that reflects the error tendencies of word hypotheses in speech recognition for transcribing broadcast news, including ill-trained spontaneous speech. The proposed rescoring assigns penalties to sentence hypotheses according to the recognition error tendencies observed in the training lattices, using a set of weighting factors for feature functions activated by a variety of linguistic contexts. The weighting factors penalize word hypotheses that are unlikely to be correct and reward those that are likely to be correct. We introduce two training techniques for obtaining the factors. The first is based on conditional random fields (CRFs); the second is based on the minimization of word errors and explicitly reduces the expected number of word errors. In transcribing Japanese broadcast news, the method achieved a word error rate (WER) of 10.38%, a 6.06% relative reduction compared with conventional lattice rescoring.
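The general idea of rescoring with weighted feature functions and measuring expected word errors can be illustrated with a minimal sketch. It is not the authors' implementation: it works on a toy n-best list rather than on lattices, the unigram and bigram features are an assumed stand-in for the paper's "linguistic contexts", the weights are set by hand rather than trained, and all function names are hypothetical.

```python
import math

def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1]

def features(words):
    """Count unigram and bigram features (an assumed feature set)."""
    feats = {}
    for w in words:
        feats[("uni", w)] = feats.get(("uni", w), 0) + 1
    padded = ["<s>"] + words + ["</s>"]
    for a, b in zip(padded, padded[1:]):
        feats[("bi", a, b)] = feats.get(("bi", a, b), 0) + 1
    return feats

def rescore(nbest, weights):
    """Add the weighted feature sum to each hypothesis's baseline log score."""
    return [base + sum(weights.get(f, 0.0) * c
                       for f, c in features(words).items())
            for words, base in nbest]

def posteriors(scores):
    """Softmax over hypothesis scores (shifted by the max for stability)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def expected_word_errors(nbest, weights, reference):
    """Posterior-weighted average of word errors against the reference."""
    post = posteriors(rescore(nbest, weights))
    return sum(p * edit_distance(reference, words)
               for p, (words, _) in zip(post, nbest))

# Toy data: the baseline recognizer prefers the erroneous hypothesis.
reference = "the cat sat".split()
nbest = [("the cat sat".split(), -10.0),
         ("the cat sad".split(), -9.5)]
weights = {("uni", "sad"): -2.0}  # hand-set penalty for an error-prone word

scores = rescore(nbest, weights)
best = max(range(len(nbest)), key=lambda i: scores[i])
print(" ".join(nbest[best][0]))
print(expected_word_errors(nbest, {}, reference))
print(expected_word_errors(nbest, weights, reference))
```

In this toy case the penalty on the error-prone word moves the correct hypothesis to the top and lowers the expected word errors. A training procedure of the kind the abstract describes would adjust the weights to reduce that expectation over the training data; the training step itself is not shown here.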

Similar resources

Word error rate minimization using an integrated confidence measure

This paper describes a new criterion of speech recognition using an integrated confidence measure for minimization of the word error rate (WER). Conventional criteria for WER minimization obtain an expected WER of a sentence hypothesis merely by comparing it with other hypotheses in an n-best list. The proposed criterion estimates the expected WER by using an integrated confidence measure with ...

Risk-Based Semi-Supervised Discriminative Language Modeling for Broadcast Transcription

This paper describes a new method for semi-supervised discriminative language modeling, which is designed to improve the robustness of a discriminative language model (LM) obtained from manually transcribed (labeled) data. The discriminative LM is implemented as a log-linear model, which employs a set of linguistic features derived from word or phoneme sequences. The proposed semi-supervised di...

An Empirical Study of Word Error Minimization Approaches for Mandarin Large Vocabulary Continuous Speech Recognition

This paper presents an empirical study of word error minimization approaches for Mandarin large vocabulary continuous speech recognition (LVCSR). First, the minimum phone error (MPE) criterion, which is one of the most popular discriminative training criteria, is extensively investigated for both acoustic model training and adaptation in a Mandarin LVCSR system. Second, the word error minimizat...

Towards automatic closed captioning: low latency real time broadcast news transcription

In this paper, we present a low latency real-time Broadcast News recognition system capable of transcribing live television newscasts with reasonable accuracy. We describe our recent modeling and efficiency improvements that yield a 22% word error rate on the Hub4e98 test set while running faster than real-time. These include the discriminative training of a feature transform and the acoustic m...

Applying a Grammar-Based Language Model to a Simplified Broadcast-News Transcription Task

We propose a language model based on a precise, linguistically motivated grammar (a hand-crafted Head-driven Phrase Structure Grammar) and a statistical model estimating the probability of a parse tree. The language model is applied by means of an N-best rescoring step, which allows the performance gains relative to the baseline system without rescoring to be measured directly. To demonstrate that ...

Publication date: 2008